Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies.

Authors

  • Chengliang Dong
  • Peng Wei
  • Xueqiu Jian
  • Richard Gibbs
  • Eric Boerwinkle
  • Kai Wang
  • Xiaoming Liu

Abstract

Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthologous approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database.
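
The abstract describes combining several independent scores with allele frequency into an ensemble score and comparing discriminative power. The sketch below illustrates that general idea only. It assumes a plain logistic regression as the combiner, uses three made-up component scores instead of nine, and runs on entirely synthetic data; the function names, feature distributions and parameters are hypothetical and are not taken from the paper. Discriminative power is measured as the area under the ROC curve (AUC).

```python
import math
import random

def auc(labels, scores):
    """AUC as the fraction of (deleterious, benign) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, X):
    """Ensemble score: logistic function of a weighted sum of the features."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi, pi in zip(X, y, predict(w, b, X)):
            err = pi - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def make_synthetic(n, rng):
    """Synthetic variants (assumption): 3 noisy component scores plus a rarity feature."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = deleterious, 0 = benign
        scores = [rng.gauss(0.8 * label, 1.0) for _ in range(3)]
        # Hypothetical allele frequencies: deleterious variants tend to be rarer,
        # but the two ranges overlap, so some benign variants are rare as well.
        log10_af = rng.uniform(-6.0, -2.0) if label else rng.uniform(-4.0, -0.3)
        X.append(scores + [-log10_af])  # rarity = -log10(allele frequency)
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(400, rng)

w, b = fit_logistic(X_train, y_train)
ensemble_auc = auc(y_test, predict(w, b, X_test))
single_aucs = [auc(y_test, [x[j] for x in X_test]) for j in range(3)]

print("component score AUCs:", [round(a, 3) for a in single_aucs])
print("ensemble AUC:", round(ensemble_auc, 3))
```

The ensemble is fitted on one synthetic set and evaluated on a separate one, mirroring the abstract's point that performance should be assessed on data the scores were not tuned on.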

Related articles

Integrating Multiple Genomic Data to Predict Disease-Causing Nonsynonymous Single Nucleotide Variants in Exome Sequencing Studies

Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, due to such facts as the large number of sequenced variants, the presence of non-negligible fraction of pathogenic rare variants or de novo mutations, and ...

iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers

Accurate prediction of the pathogenicity of genomic variants, especially nonsynonymous single nucleotide variants (nsSNVs), is essential in biomedical research and clinical genetics. Most current prediction methods build a generic classifier for all genes. However, different genes and gene families have different features. We investigated whether gene-specific and family-specific customized cla...

Genetic Risk Prediction for Normal-Karyotype Acute Myeloid Leukemia Using Whole-Exome Sequencing

Normal-karyotype acute myeloid leukemia (NK-AML) is a highly malignant and cytogenetically heterogeneous hematologic cancer. We searched for somatic mutations from 10 pairs of tumor and normal cells by using a highly efficient and reliable analysis workflow for whole-exome sequencing data and performed association tests between the NK-AML and somatic mutations. We identified 21 nonsynonymous si...

Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel.

Several large ongoing initiatives that profit from next-generation sequencing technologies have driven--and in coming years will continue to drive--the emergence of long catalogs of missense single-nucleotide variants (SNVs) in the human genome. As a consequence, researchers have developed various methods and their related computational tools to classify these missense SNVs as probably deleteri...

Whole Exome Sequencing Reveals a BSCL2 Mutation Causing Progressive Encephalopathy with Lipodystrophy (PELD) in an Iranian Pediatric Patient

Background: Progressive encephalopathy with or without lipodystrophy is a rare autosomal recessive childhood-onset seipin-associated neurodegenerative syndrome, leading to developmental regression of motor and cognitive skills. In this study, we introduce a patient with developmental regression and autism. The causative mutation was found by exome sequencing. Methods: The proband showed a gener...

Journal:
  • Human molecular genetics

Volume 24, Issue 8

Pages: -

Publication date: 2015